{
 "metadata": {
  "name": "",
  "signature": "sha256:762906f449424ec70e2324ab7310748fcbe613b98d4fdbfbe3c219b793aef8a3"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# Setting up Python for machine learning: scikit-learn and IPython Notebook\n",
      "*From the video series: [Introduction to machine learning with scikit-learn](https://github.com/justmarkham/scikit-learn-videos)*"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Agenda\n",
      "\n",
      "- What are the benefits and drawbacks of scikit-learn?\n",
      "- How do I install scikit-learn?\n",
      "- How do I use the IPython Notebook?\n",
      "- What are some good resources for learning Python?"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "![scikit-learn algorithm map](images/02_sklearn_algorithms.png)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Benefits and drawbacks of scikit-learn\n",
      "\n",
      "### Benefits:\n",
      "\n",
      "- **Consistent interface** to machine learning models\n",
      "- Provides many **tuning parameters** but with **sensible defaults**\n",
      "- Exceptional **documentation**\n",
      "- Rich set of functionality for **companion tasks**\n",
      "- **Active community** for development and support\n",
      "\n",
      "### Potential drawbacks:\n",
      "\n",
      "- Harder (than R) to **get started with machine learning**\n",
      "- Less emphasis (than R) on **model interpretability**\n",
      "\n",
      "### Further reading:\n",
      "\n",
      "- Ben Lorica: [Six reasons why I recommend scikit-learn](http://radar.oreilly.com/2013/12/six-reasons-why-i-recommend-scikit-learn.html)\n",
      "- scikit-learn authors: [API design for machine learning software](http://arxiv.org/pdf/1309.0238v1.pdf)\n",
      "- Data School: [Should you teach Python or R for data science?](http://www.dataschool.io/python-or-r-for-data-science/)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "![scikit-learn logo](images/02_sklearn_logo.png)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Installing scikit-learn\n",
      "\n",
      "**Option 1:** [Install scikit-learn library](http://scikit-learn.org/stable/install.html) and dependencies (NumPy and SciPy)\n",
      "\n",
      "**Option 2:** [Install Anaconda distribution](https://store.continuum.io/cshop/anaconda/) of Python, which includes:\n",
      "\n",
      "- Hundreds of useful packages (including scikit-learn)\n",
      "- IPython and IPython Notebook\n",
      "- conda package manager\n",
      "- Spyder IDE"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "![IPython header](images/02_ipython_header.png)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Using the IPython Notebook\n",
      "\n",
      "### Components:\n",
      "\n",
      "- **IPython interpreter:** enhanced version of the standard Python interpreter\n",
      "- **Browser-based notebook interface:** weave together code, formatted text, and plots\n",
      "\n",
      "### Installation:\n",
      "\n",
      "- **Option 1:** [Install IPython and the notebook](http://ipython.org/install.html)\n",
      "- **Option 2:** Included with the Anaconda distribution\n",
      "\n",
      "### Launching the Notebook:\n",
      "\n",
      "- Type **ipython notebook** at the command line to open the dashboard\n",
      "- Don't close the command line window while the Notebook is running\n",
      "\n",
      "### Keyboard shortcuts:\n",
      "\n",
      "**Command mode** (gray border)\n",
      "\n",
      "- Create new cells above (**a**) or below (**b**) the current cell\n",
      "- Navigate using the **up arrow** and **down arrow**\n",
      "- Convert the cell type to Markdown (**m**) or code (**y**)\n",
      "- See keyboard shortcuts using **h**\n",
      "- Switch to Edit mode using **Enter**\n",
      "\n",
      "**Edit mode** (green border)\n",
      "\n",
      "- **Ctrl+Enter** to run a cell\n",
      "- Switch to Command mode using **Esc**\n",
      "\n",
      "### IPython and Markdown resources:\n",
      "\n",
      "- [nbviewer](http://nbviewer.ipython.org/): view notebooks online as static documents\n",
      "- [IPython documentation](http://ipython.org/ipython-doc/stable/index.html): focuses on the interpreter\n",
      "- [IPython Notebook tutorials](http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/Notebook/Index.ipynb): in-depth introduction\n",
      "- [GitHub's Mastering Markdown](https://guides.github.com/features/mastering-markdown/): short guide with lots of examples"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Resources for learning Python\n",
      "\n",
      "- [Codecademy's Python course](http://www.codecademy.com/en/tracks/python): browser-based, tons of exercises\n",
      "- [DataQuest](https://dataquest.io/missions): browser-based, teaches Python in the context of data science\n",
      "- [Google's Python class](https://developers.google.com/edu/python/): slightly more advanced, includes videos and downloadable exercises (with solutions)\n",
      "- [Python for Informatics](http://www.pythonlearn.com/): beginner-oriented book, includes slides and videos"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Comments or Questions?\n",
      "\n",
      "- Email: <kevin@dataschool.io>\n",
      "- Website: http://dataschool.io\n",
      "- Twitter: [@justmarkham](https://twitter.com/justmarkham)"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from IPython.core.display import HTML\n",
      "def css_styling():\n",
      "    styles = open(\"styles/custom.css\", \"r\").read()\n",
      "    return HTML(styles)\n",
      "css_styling()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<style>\n",
        "    @font-face {\n",
        "        font-family: \"Computer Modern\";\n",
        "        src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf');\n",
        "    }\n",
        "    div.cell{\n",
        "        width: 90%;\n",
        "/*        margin-left:auto;*/\n",
        "/*        margin-right:auto;*/\n",
        "    }\n",
        "    ul {\n",
        "        line-height: 145%;\n",
        "        font-size: 90%;\n",
        "    }\n",
        "    li {\n",
        "        margin-bottom: 1em;\n",
        "    }\n",
        "    h1 {\n",
        "        font-family: Helvetica, serif;\n",
        "    }\n",
        "    h4{\n",
        "        margin-top: 12px;\n",
        "        margin-bottom: 3px;\n",
        "       }\n",
        "    div.text_cell_render{\n",
        "        font-family: Computer Modern, \"Helvetica Neue\", Arial, Helvetica, Geneva, sans-serif;\n",
        "        line-height: 145%;\n",
        "        font-size: 130%;\n",
        "        width: 90%;\n",
        "        margin-left:auto;\n",
        "        margin-right:auto;\n",
        "    }\n",
        "    .CodeMirror{\n",
        "            font-family: \"Source Code Pro\", source-code-pro,Consolas, monospace;\n",
        "    }\n",
        "/*    .prompt{\n",
        "        display: None;\n",
        "    }*/\n",
        "    .text_cell_render h5 {\n",
        "        font-weight: 300;\n",
        "        font-size: 16pt;\n",
        "        color: #4057A1;\n",
        "        font-style: italic;\n",
        "        margin-bottom: 0.5em;\n",
        "        margin-top: 0.5em;\n",
        "        display: block;\n",
        "    }\n",
        "\n",
        "    .warning{\n",
        "        color: rgb( 240, 20, 20 )\n",
        "        }\n",
        "</style>\n",
        "<script>\n",
        "    MathJax.Hub.Config({\n",
        "                        TeX: {\n",
        "                           extensions: [\"AMSmath.js\"]\n",
        "                           },\n",
        "                tex2jax: {\n",
        "                    inlineMath: [ ['$','$'], [\"\\\\(\",\"\\\\)\"] ],\n",
        "                    displayMath: [ ['$$','$$'], [\"\\\\[\",\"\\\\]\"] ]\n",
        "                },\n",
        "                displayAlign: 'center', // Change this to 'center' to center equations.\n",
        "                \"HTML-CSS\": {\n",
        "                    styles: {'.MathJax_Display': {\"margin\": 4}}\n",
        "                }\n",
        "        });\n",
        "</script>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 1,
       "text": [
        "<IPython.core.display.HTML at 0x3fe4240>"
       ]
      }
     ],
     "prompt_number": 1
    }
   ],
   "metadata": {}
  }
 ]
}